Hierarchical System for Content-based Audio Classification and Retrieval
Abstract
A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. In the first stage, called coarse-level audio classification and segmentation, audio recordings are classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of temporal curves of the energy function, the average zero-crossing rate, and the fundamental frequency of audio signals. In the second stage, called fine-level audio classification, environmental sounds are classified into finer classes such as applause, rain, and birds' sound. This stage is based on time-frequency analysis of audio signals and the use of the hidden Markov model (HMM) for classification. In the third stage, query-by-example audio retrieval is implemented, in which sounds similar to an input audio sample are found. The modeling of audio features with the hidden Markov model, the procedures of audio classification and retrieval, and the experimental results are described. It is shown that, with the proposed system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy higher than 90%. Examples of fine-level audio classification and audio retrieval with the proposed HMM-based method are also provided.
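As a rough illustration of two of the temporal curves named in the abstract, the short-time energy function and the zero-crossing rate, the following minimal sketch computes both per frame in plain Python. The frame length, hop size, sample rate, function names, and the synthetic test signal are assumptions made for this example. The abstract does not give the authors' parameters, their fundamental-frequency estimation, or their morphological and statistical analysis, and none of that is reproduced here.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def feature_curves(samples, frame_len=256, hop=128):
    """Return per-frame energy and zero-crossing-rate curves.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    frames = frame_signal(samples, frame_len, hop)
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]
    return energy, zcr

# Synthetic input (assumed for the example): a 440 Hz tone followed by silence.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2048)]
silence = [0.0] * 2048
energy, zcr = feature_curves(tone + silence)
```

On this synthetic signal, both curves drop to zero in the frames that contain only silence. A classifier of the kind the abstract describes would analyze curves like these, but how it does so is not specified in the text above.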
Similar resources
Hierarchical classification of audio data for archiving and retrieving
A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The first stage is called the coarse-level audio classification and segmentation, where audio recordings are classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical...
Classification of general audio data for content-based retrieval
In this paper, we address the problem of classification of continuous general audio data (GAD) for content-based retrieval, and describe a scheme that is able to classify audio segments into seven categories consisting of silence, single speaker speech, music, environmental noise, multiple speakers' speech, simultaneous speech and music, and speech and noise. We studied a total of 143 classificat...
Content-Based Classification and Retrieval of Audio
An online audio classification and segmentation system is presented in this research, where audio recordings are classified and segmented into speech, music, several types of environmental sounds and silence based on audio content analysis. This is the first step of our continuing work towards a general content-based audio classification and retrieval system. The extracted audio features include tem...
Semiautomatic Image Retrieval Using the High Level Semantic Labels
Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges associated with each approach have led researchers to combine them and to use semi-automatic retrieval, with user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provides two kinds of qu...
Retrieval by Content in Symbolic-Image Databases
Two approaches for integrating images into the framework of a database management system are presented. The classification approach preprocesses all images and attaches a semantic classification and an associated certainty factor to each object found in the image. The abstraction approach describes each object in the image by using a vector consisting of the values of some of its features (e.g., ...